model search
Poodle: Seamlessly Scaling Down Large Language Models with Just-in-Time Model Replacement
Nils Strassenburg, Boris Glavic, Tilmann Rabl
Businesses increasingly rely on large language models (LLMs) to automate simple repetitive tasks instead of developing custom machine learning models. LLMs require few, if any, training examples and can be utilized by users without expertise in model development. However, this comes at the cost of substantially higher resource and energy consumption compared to smaller models, which often achieve similar predictive performance for simple tasks. In this paper, we present our vision for just-in-time model replacement (JITR), where, upon identifying a recurring task in calls to an LLM, the model is transparently replaced with a cheaper alternative that performs well for this specific task. JITR retains the ease of use and low development effort of LLMs while saving significant cost and energy. We discuss the main challenges in realizing our vision regarding the identification of recurring tasks and the creation of a custom model. Specifically, we argue that model search and transfer learning will play a crucial role in JITR to efficiently identify and fine-tune models for a recurring task. Using our JITR prototype Poodle, we demonstrate significant savings on example tasks.
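The core JITR loop described above (identify a recurring task, then swap in a cheap specialist) can be sketched as a routing wrapper. This is a minimal illustration, not Poodle's actual design: the class name, the signature heuristic, and the `train_specialist` factory are all hypothetical stand-ins.

```python
from collections import Counter

class JITRouter:
    """Illustrative just-in-time model replacement (JITR) router.

    Sends prompts to the expensive LLM until a task signature has
    recurred often enough, then trains and routes to a cheap specialist.
    All names here are hypothetical; the real system is more involved.
    """

    def __init__(self, llm, train_specialist, threshold=100):
        self.llm = llm                            # callable: prompt -> answer
        self.train_specialist = train_specialist  # factory: examples -> callable
        self.threshold = threshold
        self.seen = Counter()                     # task signature -> call count
        self.examples = {}                        # signature -> (prompt, answer) pairs
        self.specialists = {}                     # signature -> cheap model

    def signature(self, prompt):
        # Crude stand-in for task identification: bucket calls by the
        # instruction part of the prompt (the text before the first colon).
        return prompt.split(":", 1)[0].strip().lower()

    def __call__(self, prompt):
        sig = self.signature(prompt)
        if sig in self.specialists:
            return self.specialists[sig](prompt)  # cheap path
        answer = self.llm(prompt)                 # expensive path
        self.seen[sig] += 1
        self.examples.setdefault(sig, []).append((prompt, answer))
        if self.seen[sig] >= self.threshold:
            # Recurring task detected: build a cheap model once, from the
            # LLM's own outputs, and use it for all future calls.
            self.specialists[sig] = self.train_specialist(self.examples[sig])
        return answer
```

In practice, the interesting work hides in `signature` (recognizing that two calls are the same task) and `train_specialist` (model search plus fine-tuning), which is exactly where the abstract places the main challenges.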
HM3: Heterogeneous Multi-Class Model Merging
Foundation language model deployments often include auxiliary guard-rail models that filter or classify text, detecting jailbreak attempts or biased and toxic output, and ensuring topic adherence. These additional models increase the complexity and cost of inference, especially since many are themselves large language models. To address this issue, we explore training-free model merging techniques to consolidate these models into a single, multi-functional model. We propose Heterogeneous Multi-Class Model Merging (HM3) as a simple technique for merging multi-class classifiers with heterogeneous label spaces. Unlike parameter-efficient fine-tuning techniques like LoRA, which require extensive training and add complexity during inference, recent advancements allow models to be merged in a training-free manner. We report promising results for merging BERT-based guard models, some of which attain an average F1-score higher than the source models while reducing inference time by up to 44%. We introduce self-merging to assess the impact of reduced task-vector density, finding that the weaker-performing hate speech classifier benefits from self-merging while higher-performing classifiers do not, which raises questions about using task-vector reduction for model tuning.
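The training-free merging the abstract refers to builds on task arithmetic: subtract the base model's weights from each fine-tuned model to get a task vector, average the vectors, and add the result back to the base. The sketch below shows only that core step on plain weight dictionaries; HM3's handling of heterogeneous classifier heads (one head per source label space) is omitted, and the function name and `scale` parameter are illustrative.

```python
import numpy as np

def merge_task_vectors(base, finetuned_models, scale=1.0):
    """Training-free merge in the spirit of task arithmetic.

    tau_i = theta_i - theta_base
    theta_merged = theta_base + scale * mean_i(tau_i)

    `base` and each entry of `finetuned_models` map layer name -> array.
    Simplified sketch: real merges operate per-tensor on transformer
    checkpoints and keep the source-specific classification heads.
    """
    merged = {}
    for name, w_base in base.items():
        taus = [m[name] - w_base for m in finetuned_models]
        merged[name] = w_base + scale * np.mean(taus, axis=0)
    return merged
```

Because no gradients are involved, the merge costs one pass over the weights, which is where the inference-time and deployment-complexity savings come from: one merged network replaces several guard models.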
OutRank: Speeding up AutoML-based Model Search for Large Sparse Data sets with Cardinality-aware Feature Ranking
The design of modern recommender systems relies on understanding which parts of the feature space are relevant for solving a given recommendation task. However, real-world data sets in this domain are often characterized by their large size, sparsity, and noise, making it challenging to identify meaningful signals. Feature ranking represents an efficient branch of algorithms that can help address these challenges by identifying the most informative features and facilitating the automated search for more compact and better-performing models (AutoML). We introduce OutRank, a system for versatile feature ranking and data quality-related anomaly detection. OutRank was built with categorical data in mind, utilizing a variant of mutual information that is normalized with regard to the noise produced by features of the same cardinality. We further extend the measure by incorporating information on feature similarity and combined relevance. The proposed approach's feasibility is demonstrated by speeding up a state-of-the-art AutoML system on a synthetic data set with no performance loss. On a real-life click-through-rate prediction data set, OutRank outperformed strong baselines such as random forest-based approaches. The proposed approach enables exploration of up to 300% larger feature spaces compared to AutoML-only approaches, enabling faster search for better models on off-the-shelf hardware.
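The cardinality-aware normalization mentioned above can be illustrated as follows: high-cardinality categorical features inflate raw mutual information, so subtracting the score a shuffled copy of the feature (same cardinality, same marginals) would achieve gives a rough noise floor. This is a simplified sketch in the spirit of OutRank's idea, not its exact estimator; the function names and the permutation count are assumptions.

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two categorical arrays."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (x_idx, y_idx), 1)       # contingency counts
    joint /= joint.sum()                      # joint distribution
    px = joint.sum(axis=1, keepdims=True)     # marginals
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def cardinality_adjusted_mi(x, y, n_perm=30, seed=0):
    """MI of feature x with target y, minus the average MI of random
    shuffles of x. The shuffles preserve cardinality and marginals, so
    their score estimates the bias a feature of this cardinality earns
    by chance (a rough stand-in for OutRank's normalization)."""
    rng = np.random.default_rng(seed)
    noise = np.mean([mutual_information(rng.permutation(x), y)
                     for _ in range(n_perm)])
    return mutual_information(x, y) - noise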
Automating Machine Learning Pipelines
Creating a Machine Learning model is a difficult task because we need to write a lot of code to try different models and find out the performing model for that particular problem. There are different libraries that can automate this process to find out the best performing Machine Learning model but they also require some coding. What if I tell you that we can run multiple AutoML algorithms to find out the best model architecture for classification problems in a single code cell? Model search helps in implementing AutoML for classification problems. It runs multiple ML algorithms and compares them with each other.
Pre and Post Counting for Scalable Statistical-Relational Model Discovery
Statistical-Relational Model Discovery aims to find statistically relevant patterns in relational data. For example, a relational dependency pattern may stipulate that a user's gender is associated with the gender of their friends. As with propositional (non-relational) graphical models, the major scalability bottleneck for model discovery is computing instantiation counts: the number of times a relational pattern is instantiated in a database. Previous work on propositional learning utilized pre-counting or post-counting to solve this task. This paper takes a detailed look at the memory and speed trade-offs between pre-counting and post-counting strategies for relational learning. A pre-counting approach computes and caches instantiation counts for a large set of relational patterns before model search. A post-counting approach computes an instantiation count dynamically on-demand for each candidate pattern generated during the model search. We describe a novel hybrid approach, tailored to relational data, that achieves a sweet spot with pre-counting for patterns involving positive relationships (e.g. pairs of users who are friends) and post-counting for patterns involving negative relationships (e.g. pairs of users who are not friends). Our hybrid approach scales model discovery to millions of data facts.
- North America > Canada (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
AutoRec: An Automated Recommender System
Wang, Ting-Hsiang, Song, Qingquan, Han, Xiaotian, Liu, Zirui, Jin, Haifeng, Hu, Xia
For example, NCF [8] takes user-item implicit feedback data as inputs for the rating prediction task; and DeepFM [6] leverages both numerical and categorical data for the CTR prediction task. However, high degree of specialization comes at the expense of model adaptability and tuning complexity. As recommendation tasks evolve over time and additional types of data are collected, the originally apt model can either become obsolete or require tremendous tuning efforts. So far, several pipelines for recommender systems, e.g., OpenRec [16] and SMORe [4], tried to address the adaptability issue via providing modular base blocks that can be selected according to the context of recommendation. Nevertheless, both determining the blocks to use and tuning the model parameters are not straightforward when facing new data and changing tasks. In order to bridge the gap, we present AutoRec, which aims to provide an end-to-end solution to automate model selection and hyperparameter tuning. While many AutoML libraries, such as Auto-Sklearn [5] and TPOT [12] have shown promising results in general-purpose machine learning tasks (e.g., regression and hyperparameter tuning) and
- North America > United States > Texas (0.07)
- North America > United States > Florida > Hillsborough County > University (0.07)
- North America > United States > New York > New York County > New York City (0.04)
AutoML on Databricks: Augmenting Data Science from Data Prep to Operationalization - The Databricks Blog
Thousands of data science jobs are going unfilled today as global demand for the talent greatly outstrips supply. Every day, businesses pay the price of the data scientist shortage in missed opportunities and slow innovation. For organizations to realize the full potential of machine learning, data teams have to build hundreds of predictive models a year. For most enterprises, only a fraction of that number is actually achieved due to understaffed data science teams. Databricks can help data science teams be more productive by automating various steps of the data science workflow – including feature engineering, hyperparameter tuning, model search, and deployment – for a fully controlled and transparent augmented ML experience.